Overview

Dataset statistics

Number of variables: 13
Number of observations: 2774
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 271.0 KiB
Average record size in memory: 100.0 B
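
The average record size appears to be the total in-memory size divided by the number of observations. A quick check in Python, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 271.0
n_observations = 2774

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 100.0 B
```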

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with amount_invoices and 3 other fields [High correlation]
amount_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
amount_items is highly correlated with gross_revenue and 3 other fields [High correlation]
amount_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other fields [High correlation]
avg_unique_basket_size is highly correlated with amount_products and 1 other fields [High correlation]
gross_revenue is highly correlated with amount_invoices and 1 other fields [High correlation]
amount_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
amount_items is highly correlated with gross_revenue and 1 other fields [High correlation]
amount_products is highly correlated with amount_invoices [High correlation]
avg_ticket is highly correlated with amount_returns and 1 other fields [High correlation]
amount_returns is highly correlated with avg_ticket and 1 other fields [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other fields [High correlation]
gross_revenue is highly correlated with amount_invoices and 2 other fields [High correlation]
amount_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
amount_items is highly correlated with gross_revenue and 3 other fields [High correlation]
amount_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with amount_items [High correlation]
df_index is highly correlated with avg_recency_days [High correlation]
gross_revenue is highly correlated with amount_invoices and 5 other fields [High correlation]
amount_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
amount_items is highly correlated with gross_revenue and 5 other fields [High correlation]
amount_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 3 other fields [High correlation]
amount_returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_recency_days is highly correlated with df_index [High correlation]
avg_basket_size is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 51.90076423) [Skewed]
amount_returns is highly skewed (γ1 = 50.04036419) [Skewed]
frequency is highly skewed (γ1 = 46.08539806) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.86093386) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 34 (1.2%) zeros [Zeros]
amount_returns has 1444 (52.1%) zeros [Zeros]
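
A high-correlation alert of the kind listed above can be reproduced in spirit by computing a correlation coefficient for every pair of columns and flagging the pairs above a cut-off. The following is a minimal sketch in plain Python, not the profiler's own implementation: the function names, the 0.9 threshold, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `threshold` is a hypothetical default, not the report's configuration.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy data invented for illustration: revenue grows with the item count,
# while recency is unrelated to either.
toy = {
    "gross_revenue": [100.0, 220.0, 310.0, 405.0, 520.0],
    "amount_items": [10, 21, 30, 41, 50],
    "recency_days": [5, 40, 3, 40, 5],
}
for name_a, name_b, r in high_correlation_pairs(toy):
    print(f"{name_a} is highly correlated with {name_b} (r = {r:.3f})")
```

Only the gross_revenue / amount_items pair is flagged for this toy input. The report repeats some alerts, which is consistent with several correlation measures being evaluated, but that reading is an inference from the output rather than something the report states.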

Reproduction

Analysis started: 2022-05-17 20:33:17.306757
Analysis finished: 2022-05-17 20:33:38.413652
Duration: 21.11 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 2774
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2213.943764
Minimum: 0
Maximum: 5607
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 179.65
Q1: 878.25
median: 2016.5
Q3: 3356.75
95-th percentile: 4874.15
Maximum: 5607
Range: 5607
Interquartile range (IQR): 2478.5
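
The quantile figures are tied together by two simple identities: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. The sketch below shows both on a small made-up sample and then on the df_index figures. It uses the standard library's `statistics.quantiles` with linear interpolation; whether the report used the same interpolation rule is an assumption.

```python
from statistics import quantiles

# Small illustrative sample (not taken from the report).
sample = [1, 2, 3, 4, 5]

# method="inclusive" interpolates linearly between order statistics and
# treats the sample minimum and maximum as the 0th and 100th percentiles.
q1, median, q3 = quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(sample) - min(sample)
print(q1, median, q3, iqr, value_range)  # 2.0 3.0 4.0 2.0 4

# The same identities applied to the df_index figures reported above.
reported_iqr = 3356.75 - 878.25   # Q3 - Q1
reported_range = 5607 - 0         # Maximum - Minimum
print(reported_iqr, reported_range)  # 2478.5 5607
```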

Descriptive statistics

Standard deviation: 1500.217091
Coefficient of variation (CV): 0.6776220405
Kurtosis: -0.9545355614
Mean: 2213.943764
Median Absolute Deviation (MAD): 1219.5
Skewness: 0.3794770505
Sum: 6141480
Variance: 2250651.319
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
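
Several of the descriptive statistics are derived from one another, which makes a quick consistency check possible. The sketch below recomputes the mean, coefficient of variation, and variance from the other df_index figures; the tolerances are assumptions chosen to absorb the rounding in the printed values.

```python
from math import isclose

# Figures copied from the df_index statistics above.
n_observations = 2774
total = 6141480
mean = 2213.943764
std = 1500.217091

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the squared standard deviation

# The report rounds its figures, so compare with a tolerance.
print(isclose(total / n_observations, mean, abs_tol=1e-5))   # True
print(isclose(cv, 0.6776220405, abs_tol=1e-6))               # True
print(isclose(variance, 2250651.319, abs_tol=0.01))          # True
```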

Most frequent values
Value                Count  Frequency (%)
0                    1      < 0.1%
2872                 1      < 0.1%
2858                 1      < 0.1%
2860                 1      < 0.1%
2861                 1      < 0.1%
2864                 1      < 0.1%
2865                 1      < 0.1%
2868                 1      < 0.1%
2869                 1      < 0.1%
2870                 1      < 0.1%
Other values (2764)  2764   99.6%

10 smallest values
Value                Count  Frequency (%)
0                    1      < 0.1%
1                    1      < 0.1%
2                    1      < 0.1%
3                    1      < 0.1%
4                    1      < 0.1%
5                    1      < 0.1%
6                    1      < 0.1%
7                    1      < 0.1%
8                    1      < 0.1%
9                    1      < 0.1%

10 largest values
Value                Count  Frequency (%)
5607                 1      < 0.1%
5597                 1      < 0.1%
5591                 1      < 0.1%
5566                 1      < 0.1%
5560                 1      < 0.1%
5549                 1      < 0.1%
5548                 1      < 0.1%
5532                 1      < 0.1%
5531                 1      < 0.1%
5522                 1      < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2774
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15285.69971
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.0 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12626.65
Q1: 13815.25
median: 15242.5
Q3: 16779.75
95-th percentile: 17950.35
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2964.5

Descriptive statistics

Standard deviation: 1714.984904
Coefficient of variation (CV): 0.1121953811
Kurtosis: -1.206915065
Mean: 15285.69971
Median Absolute Deviation (MAD): 1483.5
Skewness: 0.01599078757
Sum: 42402531
Variance: 2941173.222
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
17850                1      < 0.1%
15172                1      < 0.1%
13516                1      < 0.1%
14323                1      < 0.1%
18079                1      < 0.1%
14700                1      < 0.1%
15027                1      < 0.1%
17029                1      < 0.1%
13220                1      < 0.1%
15985                1      < 0.1%
Other values (2764)  2764   99.6%

10 smallest values
Value                Count  Frequency (%)
12347                1      < 0.1%
12348                1      < 0.1%
12352                1      < 0.1%
12356                1      < 0.1%
12358                1      < 0.1%
12359                1      < 0.1%
12360                1      < 0.1%
12362                1      < 0.1%
12364                1      < 0.1%
12370                1      < 0.1%

10 largest values
Value                Count  Frequency (%)
18287                1      < 0.1%
18283                1      < 0.1%
18282                1      < 0.1%
18273                1      < 0.1%
18272                1      < 0.1%
18270                1      < 0.1%
18265                1      < 0.1%
18263                1      < 0.1%
18261                1      < 0.1%
18260                1      < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2760
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2904.751532
Minimum: 36.56
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 36.56
5-th percentile: 264.557
Q1: 628.9125
median: 1170.87
Q3: 2424.715
95-th percentile: 7579.4915
Maximum: 279138.02
Range: 279101.46
Interquartile range (IQR): 1795.8025

Descriptive statistics

Standard deviation: 10927.21927
Coefficient of variation (CV): 3.761843017
Kurtosis: 331.9508666
Mean: 2904.751532
Median Absolute Deviation (MAD): 688.765
Skewness: 16.26093044
Sum: 8057780.75
Variance: 119404120.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
731.9                2      0.1%
1078.96              2      0.1%
734.94               2      0.1%
1353.74              2      0.1%
178.96               2      0.1%
598.2                2      0.1%
2053.02              2      0.1%
1314.45              2      0.1%
745.06               2      0.1%
379.65               2      0.1%
Other values (2750)  2754   99.3%

10 smallest values
Value                Count  Frequency (%)
36.56                1      < 0.1%
52                   1      < 0.1%
52.2                 1      < 0.1%
62.43                1      < 0.1%
68.84                1      < 0.1%
70.02                1      < 0.1%
77.4                 1      < 0.1%
84.65                1      < 0.1%
90.3                 1      < 0.1%
93.35                1      < 0.1%

10 largest values
Value                Count  Frequency (%)
279138.02            1      < 0.1%
259657.3             1      < 0.1%
194550.79            1      < 0.1%
168472.5             1      < 0.1%
140450.72            1      < 0.1%
124564.53            1      < 0.1%
117379.63            1      < 0.1%
91062.38             1      < 0.1%
72882.09             1      < 0.1%
66653.56             1      < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 252
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.62689257
Minimum: 0
Maximum: 372
Zeros: 34
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 10
median: 29
Q3: 73
95-th percentile: 211
Maximum: 372
Range: 372
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 68.41964137
Coefficient of variation (CV): 1.208253504
Kurtosis: 3.432018391
Mean: 56.62689257
Median Absolute Deviation (MAD): 23.5
Skewness: 1.898344739
Sum: 157083
Variance: 4681.247326
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    99     3.6%
4                    87     3.1%
3                    85     3.1%
2                    85     3.1%
8                    76     2.7%
10                   67     2.4%
9                    66     2.4%
7                    65     2.3%
17                   62     2.2%
22                   55     2.0%
Other values (242)   2027   73.1%

10 smallest values
Value                Count  Frequency (%)
0                    34     1.2%
1                    99     3.6%
2                    85     3.1%
3                    85     3.1%
4                    87     3.1%
5                    43     1.6%
7                    65     2.3%
8                    76     2.7%
9                    66     2.4%
10                   67     2.4%

10 largest values
Value                Count  Frequency (%)
372                  1      < 0.1%
366                  1      < 0.1%
360                  1      < 0.1%
358                  3      0.1%
354                  1      < 0.1%
337                  1      < 0.1%
336                  2      0.1%
334                  1      < 0.1%
333                  2      0.1%
330                  1      < 0.1%

amount_invoices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 55
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.053352559
Minimum: 2
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 204
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 9.071461768
Coefficient of variation (CV): 1.498584739
Kurtosis: 183.9551027
Mean: 6.053352559
Median Absolute Deviation (MAD): 2
Skewness: 10.62505905
Sum: 16792
Variance: 82.29141862
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
2                    780    28.1%
3                    499    18.0%
4                    393    14.2%
5                    237    8.5%
6                    173    6.2%
7                    138    5.0%
8                    98     3.5%
9                    69     2.5%
10                   55     2.0%
11                   54     1.9%
Other values (45)    278    10.0%

10 smallest values
Value                Count  Frequency (%)
2                    780    28.1%
3                    499    18.0%
4                    393    14.2%
5                    237    8.5%
6                    173    6.2%
7                    138    5.0%
8                    98     3.5%
9                    69     2.5%
10                   55     2.0%
11                   54     1.9%

10 largest values
Value                Count  Frequency (%)
206                  1      < 0.1%
199                  1      < 0.1%
124                  1      < 0.1%
97                   1      < 0.1%
91                   2      0.1%
86                   1      < 0.1%
72                   1      < 0.1%
62                   2      0.1%
60                   1      < 0.1%
57                   1      < 0.1%

amount_items
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1639
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1700.379957
Minimum: 2
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 119.65
Q1: 330.25
median: 705.5
Q3: 1478.75
95-th percentile: 4645.5
Maximum: 196844
Range: 196842
Interquartile range (IQR): 1148.5

Descriptive statistics

Standard deviation: 6079.161482
Coefficient of variation (CV): 3.575178276
Kurtosis: 437.6447231
Mean: 1700.379957
Median Absolute Deviation (MAD): 453.5
Skewness: 17.32001834
Sum: 4716854
Variance: 36956204.33
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
310                  11     0.4%
246                  8      0.3%
150                  8      0.3%
300                  7      0.3%
394                  7      0.3%
219                  7      0.3%
493                  7      0.3%
260                  7      0.3%
200                  7      0.3%
516                  7      0.3%
Other values (1629)  2698   97.3%

10 smallest values
Value                Count  Frequency (%)
2                    1      < 0.1%
16                   1      < 0.1%
17                   1      < 0.1%
19                   1      < 0.1%
20                   1      < 0.1%
25                   1      < 0.1%
27                   2      0.1%
30                   1      < 0.1%
32                   1      < 0.1%
33                   2      0.1%

10 largest values
Value                Count  Frequency (%)
196844               1      < 0.1%
80997                1      < 0.1%
80263                1      < 0.1%
77373                1      < 0.1%
69993                1      < 0.1%
64549                1      < 0.1%
64124                1      < 0.1%
63312                1      < 0.1%
58343                1      < 0.1%
57885                1      < 0.1%

amount_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 467
Distinct (%): 16.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 129.7433309
Minimum: 2
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 10
Q1: 34
median: 72
Q3: 143
95-th percentile: 400.05
Maximum: 7838
Range: 7836
Interquartile range (IQR): 109

Descriptive statistics

Standard deviation: 277.7854086
Coefficient of variation (CV): 2.141038053
Kurtosis: 336.8230491
Mean: 129.7433309
Median Absolute Deviation (MAD): 45
Skewness: 15.34866005
Sum: 359908
Variance: 77164.73323
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
28                   38     1.4%
35                   34     1.2%
29                   30     1.1%
26                   30     1.1%
27                   30     1.1%
15                   27     1.0%
25                   27     1.0%
19                   27     1.0%
31                   27     1.0%
33                   26     0.9%
Other values (457)   2478   89.3%

10 smallest values
Value                Count  Frequency (%)
2                    11     0.4%
3                    13     0.5%
4                    16     0.6%
5                    16     0.6%
6                    24     0.9%
7                    14     0.5%
8                    13     0.5%
9                    19     0.7%
10                   19     0.7%
11                   23     0.8%

10 largest values
Value                Count  Frequency (%)
7838                 1      < 0.1%
5673                 1      < 0.1%
5095                 1      < 0.1%
4580                 1      < 0.1%
2698                 1      < 0.1%
2379                 1      < 0.1%
2060                 1      < 0.1%
1818                 1      < 0.1%
1673                 1      < 0.1%
1637                 1      < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 2772
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.33677308
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.852702153
Q1: 12.42379049
median: 17.94212763
Q3: 25.07465812
95-th percentile: 88.42744262
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 12.65086763

Descriptive statistics

Standard deviation: 1071.049203
Coefficient of variation (CV): 20.46456325
Kurtosis: 2718.321218
Mean: 52.33677308
Median Absolute Deviation (MAD): 6.338589039
Skewness: 51.90076423
Sum: 145182.2085
Variance: 1147146.395
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
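
The skewness of 51.9 reflects a long right tail: the maximum (56157.5) is far above Q3 (about 25.07). The sketch below shows how a skewness coefficient of this kind can be computed in plain Python. It offers both the population form g1 and the bias-corrected sample form; which estimator the report uses is an assumption, and the function name and toy data are illustrative only.

```python
from math import sqrt


def skewness(values, bias_corrected=True):
    """Fisher-Pearson skewness of a sample with at least 3 distinct-enough values.

    bias_corrected=False returns g1 = m3 / m2**1.5 (central moments).
    bias_corrected=True rescales g1 by sqrt(n * (n - 1)) / (n - 2).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    if not bias_corrected:
        return g1
    return g1 * sqrt(n * (n - 1)) / (n - 2)


# A symmetric sample has zero skewness.
print(skewness([1, 2, 3, 4, 5]))

# Toy ticket values (invented): one extreme order dominates the right tail,
# so the coefficient is strongly positive.
print(skewness([12.0, 15.0, 18.0, 21.0, 25.0, 5000.0]) > 0)
```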

Most frequent values
Value                Count  Frequency (%)
14.47833333          2      0.1%
4.162                2      0.1%
18.15222222          1      < 0.1%
34.63101695          1      < 0.1%
7.603333333          1      < 0.1%
23.90492647          1      < 0.1%
31.5775              1      < 0.1%
16.80722222          1      < 0.1%
28.75566929          1      < 0.1%
19.4373913           1      < 0.1%
Other values (2762)  2762   99.6%

10 smallest values
Value                Count  Frequency (%)
2.150588235          1      < 0.1%
2.4325               1      < 0.1%
2.462371134          1      < 0.1%
2.511241379          1      < 0.1%
2.515333333          1      < 0.1%
2.65                 1      < 0.1%
2.656931818          1      < 0.1%
2.707598253          1      < 0.1%
2.760621572          1      < 0.1%
2.770464191          1      < 0.1%

10 largest values
Value                Count  Frequency (%)
56157.5              1      < 0.1%
4453.43              1      < 0.1%
1687.2               1      < 0.1%
952.9875             1      < 0.1%
872.13               1      < 0.1%
841.0214493          1      < 0.1%
651.1683333          1      < 0.1%
640                  1      < 0.1%
624.4                1      < 0.1%
615.75               1      < 0.1%

amount_returns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 205
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.14563807
Minimum: 0
Maximum: 80995
Zeros: 1444
Zeros (%): 52.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 9
95-th percentile: 98.7
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1565.044043
Coefficient of variation (CV): 24.02377334
Kurtosis: 2581.837514
Mean: 65.14563807
Median Absolute Deviation (MAD): 0
Skewness: 50.04036419
Sum: 180714
Variance: 2449362.857
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
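
A MAD of 0 follows directly from the zero count: when more than half of the values are zero (52.1% here), both the median and the median of the absolute deviations from it are zero, however large the remaining values are. A minimal sketch in plain Python, with an illustrative function name and invented toy data:

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the sample median."""
    center = median(values)
    return median(abs(v - center) for v in values)


# Toy returns (invented): 6 of 10 customers returned nothing.
toy_returns = [0, 0, 0, 0, 0, 0, 3, 9, 40, 900]
print(median(toy_returns), median_absolute_deviation(toy_returns))  # 0.0 0.0

# For comparison, a sample without a dominant value has a non-zero MAD.
print(median_absolute_deviation([1, 2, 3, 4, 100]))  # 1
```

This is why the standard deviation (1565.04) and the MAD (0) disagree so sharply for amount_returns: the former is driven by a few very large values, the latter by the majority of zeros.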

Most frequent values
Value                Count  Frequency (%)
0                    1444   52.1%
1                    145    5.2%
2                    120    4.3%
3                    84     3.0%
4                    73     2.6%
5                    58     2.1%
6                    58     2.1%
8                    45     1.6%
12                   42     1.5%
7                    41     1.5%
Other values (195)   664    23.9%

10 smallest values
Value                Count  Frequency (%)
0                    1444   52.1%
1                    145    5.2%
2                    120    4.3%
3                    84     3.0%
4                    73     2.6%
5                    58     2.1%
6                    58     2.1%
7                    41     1.5%
8                    45     1.6%
9                    35     1.3%

10 largest values
Value                Count  Frequency (%)
80995                1      < 0.1%
9014                 1      < 0.1%
8060                 1      < 0.1%
4627                 1      < 0.1%
3768                 1      < 0.1%
3335                 1      < 0.1%
2975                 1      < 0.1%
2022                 1      < 0.1%
2012                 1      < 0.1%
1920                 1      < 0.1%

avg_recency_days
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1155
Distinct (%): 41.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -78.79449884
Minimum: -366
Maximum: -1
Zeros: 0
Zeros (%): 0.0%
Negative: 2774
Negative (%): 100.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: -366
5-th percentile: -224
Q1: -99
median: -59
Q3: -34.14930556
95-th percentile: -13
Maximum: -1
Range: 365
Interquartile range (IQR): 64.85069444

Descriptive statistics

Standard deviation: 66.52001781
Coefficient of variation (CV): -0.844221599
Kurtosis: 3.673385052
Mean: -78.79449884
Median Absolute Deviation (MAD): 30
Skewness: -1.828126135
Sum: -218575.9398
Variance: 4424.912769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
-70                  21     0.8%
-46                  18     0.6%
-55                  17     0.6%
-91                  16     0.6%
-49                  16     0.6%
-31                  16     0.6%
-42                  15     0.5%
-35                  15     0.5%
-21                  15     0.5%
-14                  14     0.5%
Other values (1145)  2611   94.1%

10 smallest values
Value                Count  Frequency (%)
-366                 1      < 0.1%
-365                 1      < 0.1%
-364                 1      < 0.1%
-363                 1      < 0.1%
-357                 2      0.1%
-356                 1      < 0.1%
-355                 2      0.1%
-352                 1      < 0.1%
-351                 2      0.1%
-350                 3      0.1%

10 largest values
Value                Count  Frequency (%)
-1                   9      0.3%
-2                   4      0.1%
-2.861538462         1      < 0.1%
-3                   6      0.2%
-3.330357143         1      < 0.1%
-3.351351351         1      < 0.1%
-4                   5      0.2%
-4.191011236         1      < 0.1%
-4.275862069         1      < 0.1%
-4.5                 1      < 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1225
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04969870057
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008746355685
Q1: 0.01575839204
median: 0.0243902439
Q3: 0.04166666667
95-th percentile: 0.1153846154
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.02590827462

Descriptive statistics

Standard deviation: 0.337595074
Coefficient of variation (CV): 6.792835026
Kurtosis: 2296.516337
Mean: 0.04969870057
Median Absolute Deviation (MAD): 0.01069454458
Skewness: 46.08539806
Sum: 137.8641954
Variance: 0.113970434
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
0.0625               18     0.6%
0.02777777778        17     0.6%
0.02380952381        16     0.6%
0.09090909091        15     0.5%
0.08333333333        15     0.5%
0.03448275862        14     0.5%
0.02941176471        14     0.5%
0.03571428571        13     0.5%
0.01923076923        13     0.5%
0.02127659574        13     0.5%
Other values (1215)  2626   94.7%

10 smallest values
Value                Count  Frequency (%)
0.005449591281       1      < 0.1%
0.005464480874       1      < 0.1%
0.005479452055       1      < 0.1%
0.005494505495       1      < 0.1%
0.005586592179       2      0.1%
0.005602240896       1      < 0.1%
0.005617977528       2      0.1%
0.00566572238        1      < 0.1%
0.005681818182       2      0.1%
0.005698005698       3      0.1%

10 largest values
Value                Count  Frequency (%)
17                   1      < 0.1%
3                    1      < 0.1%
2                    1      < 0.1%
1.142857143          1      < 0.1%
1                    8      0.3%
0.75                 1      < 0.1%
0.6666666667         3      0.1%
0.550802139          1      < 0.1%
0.5335120643         1      < 0.1%
0.5                  3      0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1938
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 245.961992
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45
Q1: 103.3333333
median: 172.125
Q3: 278.2375
95-th percentile: 587.875
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 174.9041667

Descriptive statistics

Standard deviation: 808.0807949
Coefficient of variation (CV): 3.285388887
Kurtosis: 2223.352169
Mean: 245.961992
Median Absolute Deviation (MAD): 81.29166667
Skewness: 44.86093386
Sum: 682298.5657
Variance: 652994.5711
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
100                  11     0.4%
86                   9      0.3%
60                   8      0.3%
75                   8      0.3%
136                  7      0.3%
105                  7      0.3%
197                  7      0.3%
82                   7      0.3%
208                  7      0.3%
73                   7      0.3%
Other values (1928)  2696   97.2%

10 smallest values
Value                Count  Frequency (%)
1                    1      < 0.1%
3.333333333          1      < 0.1%
5.333333333          1      < 0.1%
5.666666667          1      < 0.1%
6.142857143          1      < 0.1%
7.5                  1      < 0.1%
9                    1      < 0.1%
9.5                  1      < 0.1%
11                   1      < 0.1%
11.875               1      < 0.1%

10 largest values
Value                Count  Frequency (%)
40498.5              1      < 0.1%
6009.333333          1      < 0.1%
3868.65              1      < 0.1%
2880                 1      < 0.1%
2733.944444          1      < 0.1%
2518.769231          1      < 0.1%
2160.333333          1      < 0.1%
2082.225806          1      < 0.1%
2000                 1      < 0.1%
1903.5               1      < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 897
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.13545486
Minimum: 0.2
Maximum: 177
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 2
Q1: 7.511363636
median: 13.5
Q3: 22
95-th percentile: 45.0875
Maximum: 177
Range: 176.8
Interquartile range (IQR): 14.48863636

Descriptive statistics

Standard deviation: 14.26329428
Coefficient of variation (CV): 0.8323849233
Kurtosis: 10.00578646
Mean: 17.13545486
Median Absolute Deviation (MAD): 6.666666667
Skewness: 2.246049339
Sum: 47533.75179
Variance: 203.4415637
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
8                    34     1.2%
13                   33     1.2%
7                    32     1.2%
16                   32     1.2%
9                    32     1.2%
12                   30     1.1%
14                   29     1.0%
18.5                 29     1.0%
6                    29     1.0%
15                   29     1.0%
Other values (887)   2465   88.9%

10 smallest values
Value                Count  Frequency (%)
0.2                  1      < 0.1%
0.25                 3      0.1%
0.3333333333         6      0.2%
0.4                  1      < 0.1%
0.4090909091         1      < 0.1%
0.5                  12     0.4%
0.5454545455         1      < 0.1%
0.5555555556         1      < 0.1%
0.5714285714         1      < 0.1%
0.6176470588         1      < 0.1%

10 largest values
Value                Count  Frequency (%)
177                  1      < 0.1%
105                  1      < 0.1%
104                  1      < 0.1%
98                   1      < 0.1%
95.5                 1      < 0.1%
94.33333333          1      < 0.1%
93.33333333          1      < 0.1%
89.625               1      < 0.1%
87                   1      < 0.1%
85.66666667          1      < 0.1%

Interactions

[Pairwise interaction plots (Matplotlib v3.5.1) were rendered here; the images are not included in this text export.]
2022-05-17T17:33:24.947379image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:26.334110image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:27.768756image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:29.339697image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:30.676003image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:32.077308image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:33.464626image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:35.029004image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:36.444366image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:37.941811image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:20.847575image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:22.275909image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:23.591069image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:25.054403image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:26.450137image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:27.882773image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:29.453722image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:30.781018image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:32.191342image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:33.765715image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:35.140029image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-17T17:33:36.558403image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
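
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs: the helper names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, the sample values are made up, and ties are assumed to receive the mean of the ranks they span.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
r = pearson(x, y)
```

On this toy data ρ comes out at 1 (up to floating-point rounding) while r stays below 1, which is the difference between monotonic and linear association that the paragraph describes.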

Pearson's r

The Pearson correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the slope does not affect r; only its sign does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
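
As a rough sketch of this formula and of the invariance property, the following plain-Python example may help. The function name `pearson` and the sample values are illustrative assumptions, not part of the report.

```python
from math import sqrt


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors in the covariance and the standard deviations cancel,
    # so sums of deviations are used directly.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Made-up data with a noisy positive linear trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson(x, y)

# Shifting and rescaling y by a positive factor leaves r unchanged.
r_rescaled = pearson(x, [3.0 * b + 7.0 for b in y])

# A negative scale factor flips the sign of r but not its magnitude.
r_flipped = pearson(x, [-2.0 * b for b in y])
```

Here `r` and `r_rescaled` agree up to rounding, and `r_flipped` equals `-r`.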

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
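
The pair-counting definition above can be sketched as follows. This is the simple variant often called tau-a, written out as an assumption-laden illustration: the function name `kendall_tau_a` and the sample values are hypothetical, and statistical libraries commonly apply a tie-corrected variant, so their results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; the pair counts as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up data: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # (5 - 1) / 6
```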

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index) | df_index | customer_id | gross_revenue | recency_days | amount_invoices | amount_items | amount_products | avg_ticket | amount_returns | avg_recency_days | frequency | avg_basket_size | avg_unique_basket_size
0 | 0 | 17850 | 5391.21 | 372 | 34 | 1733 | 297 | 18.152222 | 40.0 | -1.000000 | 17.000000 | 50.970588 | 0.617647
1 | 1 | 13047 | 3232.59 | 56 | 9 | 1390 | 171 | 18.904035 | 36.0 | -52.833333 | 0.028302 | 154.444444 | 11.666667
2 | 2 | 12583 | 6705.38 | 2 | 15 | 5028 | 232 | 28.902500 | 51.0 | -26.500000 | 0.040323 | 335.200000 | 7.600000
3 | 3 | 13748 | 948.25 | 95 | 5 | 439 | 28 | 33.866071 | 0.0 | -92.666667 | 0.017921 | 87.800000 | 4.800000
4 | 4 | 15100 | 876.00 | 333 | 3 | 80 | 3 | 292.000000 | 22.0 | -20.000000 | 0.073171 | 26.666667 | 0.333333
5 | 5 | 15291 | 4623.30 | 25 | 14 | 2102 | 102 | 45.326471 | 29.0 | -26.769231 | 0.040115 | 150.142857 | 4.357143
6 | 6 | 14688 | 5630.87 | 7 | 21 | 3621 | 327 | 17.219786 | 399.0 | -19.263158 | 0.057221 | 172.428571 | 7.047619
7 | 7 | 17809 | 5411.91 | 16 | 12 | 2057 | 61 | 88.719836 | 42.0 | -39.666667 | 0.033520 | 171.416667 | 3.833333
8 | 8 | 15311 | 60767.90 | 0 | 91 | 38194 | 2379 | 25.543464 | 474.0 | -4.191011 | 0.243316 | 419.714286 | 6.230769
9 | 9 | 16098 | 2005.63 | 87 | 7 | 613 | 67 | 29.934776 | 0.0 | -47.666667 | 0.024390 | 87.571429 | 4.857143

Last rows

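(index) | df_index | customer_id | gross_revenue | recency_days | amount_invoices | amount_items | amount_products | avg_ticket | amount_returns | avg_recency_days | frequency | avg_basket_size | avg_unique_basket_size
2764 | 5522 | 17290 | 525.24 | 3 | 2 | 404 | 102 | 5.149412 | 0.0 | -13.0 | 0.142857 | 202.000000 | 46.000000
2765 | 5531 | 14785 | 77.40 | 10 | 2 | 84 | 3 | 25.800000 | 0.0 | -5.0 | 0.333333 | 42.000000 | 1.000000
2766 | 5532 | 17254 | 272.44 | 4 | 2 | 252 | 112 | 2.432500 | 0.0 | -11.0 | 0.166667 | 126.000000 | 50.000000
2767 | 5548 | 17232 | 421.52 | 2 | 2 | 203 | 36 | 11.708889 | 0.0 | -12.0 | 0.153846 | 101.500000 | 15.000000
2768 | 5549 | 17468 | 137.00 | 10 | 2 | 116 | 5 | 27.400000 | 0.0 | -4.0 | 0.400000 | 58.000000 | 2.500000
2769 | 5560 | 13596 | 697.04 | 5 | 2 | 406 | 166 | 4.199036 | 0.0 | -7.0 | 0.250000 | 203.000000 | 66.500000
2770 | 5566 | 14893 | 1237.85 | 9 | 2 | 799 | 73 | 16.956849 | 0.0 | -2.0 | 0.666667 | 399.500000 | 36.000000
2771 | 5591 | 14126 | 706.13 | 7 | 3 | 508 | 15 | 47.075333 | 50.0 | -3.0 | 0.750000 | 169.333333 | 4.666667
2772 | 5597 | 13521 | 1092.39 | 1 | 3 | 733 | 435 | 2.511241 | 0.0 | -4.5 | 0.300000 | 244.333333 | 104.000000
2773 | 5607 | 15060 | 301.84 | 8 | 4 | 262 | 120 | 2.515333 | 0.0 | -1.0 | 2.000000 | 65.500000 | 20.000000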